ACS Synthetic Biology — Latest Matching Preprints

1

Multi-Agent Dynamic Refinement Outperforms Static RAG in Clinical Reasoning for Complex Nephrology Cases

Yano, Y.; Kakizaki, H.; Nagasu, H.; Kishi, S.; Koshida, T.; Nihei, Y.; Hirano, A.; Sugawara, Y.; Imaizumi, T.; Osakabe, Y.; Sakaguchi, Y.; Nangaku, M.; Mori, H.; Naito, T.; Ohashi, M.; Maruyama, S.; Matsui, I.; Isaka, Y.; Okada, H.; Suzuki, Y.; Kashihara, N.

2026-07-16 nephrology 10.64898/2026.07.15.26358121 medRxiv

Top 12%

0.1%

Background: Large language models (LLMs) struggle with dynamic, longitudinal clinical reasoning. We developed a Multi-Stage Iterative Clinical Reasoning Agent framework to address this gap and systematically decouple the clinical efficacy of static retrieval-augmented generation (RAG) from dynamic self-refinement. Methods: Ten complex longitudinal nephrology cases, rigorously selected via a modified Delphi consensus technique, were blindly evaluated by four board-certified nephrologists and a multi-model AI panel. We compared three architectures across nine cognitive steps: (Model A) a baseline frontier LLM, (Model B) an LLM augmented with static guideline-based RAG, and (Model C) our proposed multi-agent framework featuring RAG integrated with iterative self-critique and refinement. Results: In human evaluations (20-point scale), Model C (mean 17.2, SD 1.2) significantly outperformed both Model A (16.1, 1.3) and Model B (16.2, 1.2) (P < 0.001). Implementing static RAG (Model B) yielded no significant improvement over the baseline. Automated AI evaluations (15-point scale) corroborated these findings: Model C (14.7, 0.6) outscored Model A (14.2, 0.9, P < 0.001) and Model B (14.3, 0.9, P = 0.01). While monolithic models exhibited severe score degradations in planning-heavy tasks such as dynamic differential diagnoses, the multi-agent framework effectively intercepted error cascades, achieving significantly higher diagnostic accuracy (mean 17.6, P = 0.019) and therapeutic management scores (17.3, P = 0.002). Conclusions: Static knowledge retrieval alone fails to enhance frontier LLM performance in longitudinal medical reasoning. Distributing clinical workflows into a multi-agent dynamic refinement pipeline significantly improves reasoning completeness, intercepts error cascades, and safely resolves planning bottlenecks in complex patient care.

2

A ReAct Agentic AI System for Natural Language Querying and Statistical Analysis of The Cancer Genome Atlas Clinical Data

Korutla, R.; Amal, S.

2026-07-17 health informatics 10.64898/2026.07.15.26358188 medRxiv

Top 13%

0.1%

The Cancer Genome Atlas (TCGA) holds clinical data for over 11,000 patients across 33 cancer types, but access is hard because of complex file structures, heterogeneous formats, and the need for programming. We present an agentic system for natural language querying and statistical analysis of TCGA clinical data. The system uses a large language model as an autonomous ReAct agent that selects from eight computational tools, including data extraction, descriptive statistics, Kaplan-Meier survival analysis with log-rank tests, hypothesis testing, and verification against the curated TCGA Pan-Cancer Clinical Data Resource (CDR). The agent reasons about intermediate results, adapts its approach, and returns clinically contextualized responses with source attribution and auditable traces. We introduce TCGA-Agent-Bench, 440 queries across five difficulty tiers with ground truth from the independently curated TCGA-CDR, evaluated with dual metrics of numerical accuracy and clinical completeness. The system achieves 93.4% overall accuracy (100% single-patient lookups, 99.1% cohort statistics, 92.8% comparative analyses), outperforming a fixed rule-based pipeline (87.1%), a single-pass LLM (81.8%), and retrieval-augmented generation (66.9% on a subset). Most of the benchmark is answerable from the CDR alone, so we locate the extraction layer's value in fields the CDR lacks (drug treatments, TNM components, biomarkers, biospecimen metadata): on 26 queries targeting these, the full system answers 100% versus 3.8% for CDR-only. Ablations show the reasoning loop is most impactful (+9.1% accuracy, +22.0 completeness points). A tool-based agentic architecture enables accurate, auditable analysis of clinical repositories, with value driven by tool design and recovered fields rather than model scale.

3

Molecular and phylogenetic insights into the novel Brugia sp. in Sri Lanka with new evidence for zoonotic transmission

Nimalrathna, S. U.; Harischandra, H.; Kimber, M.; Chandrasena, N.; De Silva, N.; Mallawarachchi, H.; De Silva, B. G. D. N. K.

2026-07-21 infectious diseases 10.64898/2026.07.20.26358473 medRxiv

Top 16%

0.0%

The World Health Organization (WHO) validated that Sri Lanka had eliminated lymphatic filariasis as a public health problem in 2016, the second country in Southeast Asia to attain this status. However, post-validation surveillance has identified sporadic cases of brugian filariasis. The reemergence of Brugia malayi infections in Sri Lanka warrants urgent investigations. Recent studies have shown that the parasite responsible for the reemergence is a novel zoonotic Brugia sp. maintained among dogs that is closely related to but distinct from the human-infecting B. malayi species. The current study employed morphological and morphometric assessments, revealing that this novel zoonotic Brugia sp. is within the B. malayi morphological range. Molecular characterization of three genomic regions (the nuclear genomic region SLXI, the non-coding region HhaI, and the mitochondrial genomic region COXI) confirmed it as a genetic variant more closely related to B. malayi than to B. pahangi. Phylogenetic analysis further indicated that it is a distinct genomic variant, closely related to a B. malayi-like parasite reported from India. Notably, that same parasite was identified in infected humans, animals, and potential vector mosquitoes. This, together with the detection of both human and animal blood within the same brugian infective mosquitoes and the delineation of the canine origin of the parasites in human infections, provides compelling evidence supporting zoonotic transmission of this parasite. To our knowledge, this is the first report demonstrating the presence of the same brugian parasite in humans, domestic animals, and potentially infective mosquitoes in Sri Lanka, supported by multi-genomic evidence. The recent identification of multiple potential mosquito vector species suggests that this parasite may have undergone adaptive changes, facilitating its ability to overcome the species barrier. These findings substantiate the long-held hypothesis of zoonotic transmission of the reemerged brugian parasite, highlighting significant implications for ongoing surveillance and control strategies.

4

Bridging surveillance gaps in dengue: a hierarchical model integrating mixed data sources for transmission estimation and vaccine targeting

Djaafara, B. A.; Elyazar, I. R.; Yosephine, P.; Surya, A.; Silalahi, F. S.; Handito, A.; Thohir, B.; Aryani, D.; Gunawan, D.; Nisa, A. K.; Prianto, E.; Samad, I.; Cook, A. R.; Huang, A. T.; Clapham, H. E.; Bhatt, S.; Mishra, S.

2026-07-17 epidemiology 10.64898/2026.07.15.26358208 medRxiv

Top 16%

0.0%

Estimating dengue force of infection (FOI) is essential for understanding transmission dynamics and targeting intervention programmes, yet the surveillance data required for estimation in endemic settings are often incomplete and vary in format. We developed a Bayesian hierarchical catalytic model that jointly fits age-stratified case data, aggregate case data, and seroprevalence surveys within a single framework, incorporating external covariates to improve parameter identifiability. Synthetic validation showed that covariates alone recovered accurate FOI point estimates even when most districts contributed only aggregate data, but did so with poorly calibrated uncertainty; anchoring the model with a single seroprevalence survey was necessary to bring credible interval coverage close to nominal. Applied to 128 districts across Java and Bali, Indonesia (2016-2024), the model revealed substantial spatial heterogeneity in FOI and reporting rates. Many districts in Java exceeded the WHO-suggested seroprevalence threshold for vaccine introduction, yet were classified as low-priority when using reported incidence as the prioritisation criterion, particularly in areas with weak surveillance. Model-based seroprevalence estimation, integrating multiple data sources, offers a more consistent basis for identifying high-priority districts for vaccine introduction, and is less susceptible to surveillance bias than reported incidence.
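
As a rough illustration of the link between FOI and seroprevalence that a catalytic model relies on, the sketch below uses the simplest textbook form: a constant FOI with lifelong seropositivity, so that the expected proportion seropositive at age a is 1 - exp(-FOI x a). This is not the hierarchical model described in the abstract, and the function names and example values are hypothetical.

```python
import math


def seroprevalence(foi: float, age: float) -> float:
    """Expected proportion seropositive at `age` under a constant FOI.

    Assumes a simple catalytic model: constant hazard, no seroreversion.
    """
    if foi < 0 or age < 0:
        raise ValueError("foi and age must be non-negative")
    return 1.0 - math.exp(-foi * age)


def foi_from_seroprevalence(prevalence: float, age: float) -> float:
    """Invert the same model: the constant FOI implied by seroprevalence at one age."""
    if not 0.0 <= prevalence < 1.0:
        raise ValueError("prevalence must be in [0, 1)")
    if age <= 0:
        raise ValueError("age must be positive")
    return -math.log(1.0 - prevalence) / age


if __name__ == "__main__":
    example_foi = 0.08  # illustrative value only, per year
    for age in (5, 9, 15):
        print(age, round(seroprevalence(example_foi, age), 3))
```

Under these assumptions, a single age-specific seroprevalence estimate pins down the FOI directly, which gives some intuition for why one serosurvey can anchor estimates that otherwise depend on case reports with unknown reporting rates.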

5

Multilevel Factors Associated with Nonresponse to Patient-Reported Outcome Measures in Routine Radiation Oncology Care

Liu, J. B.; Chen, Y.-J.; Edelen, M. O.; Pusic, A. L.; Martin, N. E.; Zeng, C.

2026-07-17 health systems and quality improvement 10.64898/2026.07.15.26358162 medRxiv

Top 16%

0.0%

Purpose: Nonresponse to routinely collected patient-reported outcome measures (PROMs) threatens the representativeness of aggregated data. We characterized patient-, provider-, and clinic-level factors associated with PROMIS Global-10 nonresponse in routine radiation oncology care. Methods: In this retrospective cohort study, all adults seen at five Mass General Brigham radiation oncology clinics over one year were included. The primary outcome was patient-level nonresponse, defined as never completing the portal-administered Global-10 versus completing it at least once. Using iterative mixed-effects logistic regression, we modeled patient-, provider-, and clinic-level factors. Results: Among 12,214 patients, 71 providers, and five clinics, patient- and appointment-level response rates were 35.4% and 10.9%, with patient-level response ranging nearly fivefold across clinics (12.8% to 66.2%). In Model 1, male sex, lower education, not working, and recent surgery had higher odds of nonresponse, and longer time since diagnosis lower odds. After provider- and clinic-level factors were added, patient sex, education, and employment became nonsignificant, whereas recent surgery (adjusted odds ratio [aOR] 1.97) and longer time since diagnosis (aOR 0.46 for >12 months) persisted. A provider's historical collection rate was protective but attenuated at the clinic level. There, a later program launch (aOR 0.29) and higher historical collection rate (aOR 0.79) correlated with lower nonresponse, whereas academic versus community setting did not. Conclusions: Nonresponse to routinely collected PROMs is a multilevel phenomenon driven substantially by clinic-level implementation factors, not patient characteristics alone. Because response rate is only a proxy for representativeness, PROMs programs and PRO-based performance measures should prioritize representative collection over volume.

6

Rationale and guidance for implementing the continual reassessment method for dose-finding in controlled human infection model studies

Weerasinghe, C.; Osowicki, J.; Simpson, J. A.; Crocker-Buque, T.; McCarthy, J.; Williams, E.; Price, D. J.

2026-07-17 infectious diseases 10.64898/2026.07.16.26358128 medRxiv

Top 16%

0.0%

Controlled human infection models (CHIMs) are increasingly used in infectious disease research to study pathogen dynamics and evaluate interventions under controlled conditions. However, these studies are resource-intensive and involve ethical and safety constraints, making efficient study design critical. Dose-finding is a key early component in CHIMs, where the aim is to identify a challenge dose that achieves a target infection probability. Traditional rule-based designs are commonly used but can be inefficient, motivating the use of model-based adaptive approaches such as the Bayesian Continual Reassessment Method (CRM). Although CRM has been extensively studied and widely adopted in Phase I oncology trials for identifying the maximum tolerated dose of therapeutics, its application in CHIM settings remains limited, particularly when the endpoint of interest is infection. This tutorial provides step-by-step guidance for implementing a Bayesian CRM in dose-finding CHIMs, using an oropharyngeal Neisseria gonorrhoeae challenge as a motivating case study. The framework outlines key design components, including dose-grid specification, dose-response model, prior elicitation, Bayesian updating, decision rules, and stopping criteria, with particular emphasis on a clinically interpretable parameterisation. Trial operating characteristics are evaluated through simulation studies under multiple dose-response scenarios and prior-predictive analyses, and compared with a commonly used '3+3' type rule-based design. This work highlights the advantages of Bayesian model-based designs for dose-finding in CHIMs over classic rule-based designs and provides a structured, reproducible framework for implementing CRM, supporting their application in future CHIM studies.
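
The Bayesian updating and decision-rule steps listed above can be sketched with a generic one-parameter power-model CRM. This is a minimal sketch, not the tutorial's parameterisation: it assumes infection probability at dose i is skeleton[i] ** exp(a) with a normal prior on a, approximates the posterior on a grid, and recommends the dose whose posterior mean infection probability is closest to the target. The skeleton, prior standard deviation, and target below are hypothetical values.

```python
import math


def crm_posterior(skeleton, dose_indices, infected, prior_sd=1.16, grid_points=2001):
    """Posterior mean infection probability per dose under a power-model CRM.

    Assumed model: p_i(a) = skeleton[i] ** exp(a), with a ~ Normal(0, prior_sd^2).
    `dose_indices[k]` is the dose given to participant k; `infected[k]` is 0 or 1.
    """
    lo, hi = -5.0 * prior_sd, 5.0 * prior_sd
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + k * step for k in range(grid_points)]
    log_skeleton = [math.log(s) for s in skeleton]  # each entry must lie in (0, 1)

    log_post = []
    for a in grid:
        lp = -0.5 * (a / prior_sd) ** 2  # log prior, up to a constant
        scale = math.exp(a)
        for d, y in zip(dose_indices, infected):
            log_p = scale * log_skeleton[d]  # log infection probability (< 0)
            lp += log_p if y else math.log(-math.expm1(log_p))
        log_post.append(lp)

    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]  # unnormalised posterior
    total = sum(weights)
    return [
        sum(w * math.exp(math.exp(a) * ls) for w, a in zip(weights, grid)) / total
        for ls in log_skeleton
    ]


def recommend_dose(posterior_probs, target):
    """Index of the dose whose posterior mean probability is closest to the target."""
    return min(range(len(posterior_probs)), key=lambda i: abs(posterior_probs[i] - target))


if __name__ == "__main__":
    skeleton = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical prior guesses per dose
    # Hypothetical cohort: three participants at dose index 2, one infected.
    probs = crm_posterior(skeleton, [2, 2, 2], [1, 0, 0])
    print(recommend_dose(probs, target=0.7), [round(p, 2) for p in probs])
```

A real design would add the stopping criteria and safety constraints the abstract mentions (for example, limits on skipping doses); this sketch covers only the update-and-recommend cycle.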

7

Comparative Efficacy of Vancomycin and Fidaxomicin Regimens for the Prevention of Recurrent Clostridioides difficile Infection: A Systematic Review and Network Meta-Analysis of Randomized Controlled Trials

Prosty, C.; Butler-Laporte, G.; Brophy, J.; Frenette, C.; Loo, V.; Coburn, B.; Hota, S.; Longtin, Y.; Kong, L.; Muller, M.; Steiner, T.; Valiquette, L.; Daneman, N.; Daley, P.; Nott, C.; MacFadden, D. R.; Kandel, C.; Chen, Y.; Perez-Patrigeon, S.; Lee, T. C.; McDonald, E.

2026-07-17 infectious diseases 10.64898/2026.07.14.26358112 medRxiv

Top 16%

0.0%

Background and Aims The optimal treatment for first episodes and first recurrences of Clostridioides difficile infections (CDI) is unknown and there is emerging evidence for pulse and taper (P-T) regimens. Therefore, we sought to estimate the relative efficacy of treatment options. Methods MEDLINE and CENTRAL were searched from database inception to May 21, 2025 and unpublished conference abstracts were searched from recent infectious disease conferences. RCTs on the treatment of first episodes or first recurrences of CDI comparing fixed-dose or P-T regimens of fidaxomicin or vancomycin were included. The primary and secondary outcomes were 40- and 56-day CDI recurrence, respectively. A random-effects network meta-analysis on the risk ratio (RR) scale was conducted using a standard regimen (10-14 days) of vancomycin as the comparator. Treatments were ranked using the surface under the cumulative ranking curve (SUCRA). Results Eight RCTs were included, comprising a total of 2181 patients. For 40-day recurrence, fidaxomicin P-T had the highest probability of ranking best (RR=0.10, 95% confidence interval [95% CI]=0.10-0.49, SUCRA=1.00), followed by vancomycin P-T (RR=0.49, 95% CI=0.32-0.76, SUCRA=0.61), fixed-dose fidaxomicin (RR=0.61, 95% CI=0.49-0.76, SUCRA=0.39), and, finally, fixed-dose vancomycin (SUCRA=0.00). The treatments ranked in the same order for 56-day recurrence, though only 3 RCTs reported on this timepoint. Conclusion Vancomycin P-T, fidaxomicin P-T, and fixed-dose fidaxomicin were all superior to fixed-dose vancomycin. Head-to-head comparative effectiveness RCTs are needed to quantify their relative effect sizes and their impact on long-term prevention of recurrent CDI.
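
For readers unfamiliar with the ranking metric, SUCRA for one treatment is the average of its cumulative ranking probabilities over the first n-1 ranks, so a treatment certain to rank first scores 1 and one certain to rank last scores 0. The sketch below computes it from a vector of rank probabilities; the function name and the example probabilities are hypothetical and are not taken from the meta-analysis.

```python
def sucra(rank_probs):
    """SUCRA for one treatment.

    `rank_probs[j]` is the probability that the treatment has rank j + 1
    (rank 1 = best) among n competing treatments; the entries should sum to 1.
    """
    n = len(rank_probs)
    if n < 2:
        raise ValueError("need at least two treatments")
    if abs(sum(rank_probs) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    area = 0.0
    for p in rank_probs[:-1]:  # the cumulative probability at the last rank is always 1
        cumulative += p
        area += cumulative
    return area / (n - 1)


if __name__ == "__main__":
    # Illustrative rank probabilities for one of four treatments.
    print(round(sucra([0.10, 0.60, 0.25, 0.05]), 3))
```

On this scale the reported values of 1.00 and 0.00 correspond to treatments that ranked first and last, respectively, in essentially every posterior or resampled ranking.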

8

Nationwide Mpox Genomic Surveillance Reveals Clade Ib Introductions, APOBEC3-Driven Evolution, and Terminal Deletions

Brochu, H. N.; Shi, Q.; Song, K.; Zhang, Q.; Munroe, J.; Harris, N. J.; Britt, N.; Zeng, Q.; Kapuria, K.; Chappell, J.; Norvell, B. M.; Peavy, L.; Williams, J. D.; Harris, A. B.; Chaitram, J.; Hutson, C. L.; Deng, J.; McGrath, D.; Boles, D.; Dale, S. E.; Gigante, C. M.; Iyer, L. K.

2026-07-17 infectious diseases 10.64898/2026.07.15.26357894 medRxiv

Top 16%

0.0%

Background The 2022-2023 global mpox outbreak highlighted the critical need for robust genomic surveillance capabilities to track mpox virus (MPXV) evolution and transmission dynamics. Methods Building upon our established SARS-CoV-2 sequencing infrastructure, we implemented a Molecular Loop probe-based long-read sequencing approach using Pacific Biosciences Sequel II technology for comprehensive MPXV genomic surveillance across the United States (US). From August 2024 to June 2025, we generated 326 high-quality whole genome sequences from residual mpox-positive clinical specimens collected by Labcorp across all 10 US Department of Health and Human Services regions. Results Our analysis identified two samples containing clade Ib MPXV in January and June 2025 and captured shifting trends in clade IIb diversity, with 13 distinct lineages observed. We also identified multiple instances of large (~1.6-17.6kb) deletions proximal to the inverted terminal repeats in clade IIb genomes. APOBEC3 mutation analysis indicated substantial evidence of human-to-human transmission among both clades. Further, we observed significantly higher APOBEC3-associated SNPs per kilobase (P<0.001) in clade IIb genomic variable regions relative to their central conserved region. Our assay exhibited strong reproducibility across biological replicates from individual patients and accuracy was confirmed via parallel sequencing of select specimens by US Centers for Disease Control and Prevention (CDC) using metagenomic sequencing. We also demonstrated via custom simulation that our assay discriminates all known MPXV clades and lineages, including those we have not observed in the US. Conclusions Our integrated nationwide surveillance system facilitates real-time genomic tracking of outbreak evolution, with demonstrated capacity across SARS-CoV-2 and MPXV, positioning this platform for rapid deployment during future pathogen emergence.

9

Complex intra-host SARS-CoV-2 evolution following monoclonal antibody pre-exposure prophylaxis

Kamelian, K.; Pascall, D. J.; Cheng, M. T. K.; Meng, B.; Altaf, M.; Morse, R. M.; Aggio, J. B.; Egan, D. J. S.; Chen-Xu, M.; Trivioli, G.; Sutton, B.; Richter, A.; Gonzalez-Vazquez, L. D.; Cormie, C.; Kemp, S.; Yeadon, R.; Hyatt, B.; Wong, A.; Thesin Pelamkulangara, N.; Fraser, E.; McCarthy, B.; Novaes, F.; Stott, S.; Galvin, A.; Bellis, K. L.; De Angelis, D.; Harrison, E. M.; Martin, D.; Smith, R. M.; Gupta, R. K.

2026-07-17 infectious diseases 10.64898/2026.07.14.26356329 medRxiv

Top 16%

0.0%

Background: Monoclonal antibodies have emerged as a prophylactic strategy to prevent symptomatic SARS-CoV-2 infection in immunocompromised individuals. However, the evolutionary and clinical implications of breakthrough infections under this regime remain unclear. Methods: A male in their 80s with a haematological/oncological diagnosis received a 2000 mg intravenous infusion of sotrovimab in March 2023 and was diagnosed with COVID-19 by RT-qPCR from a nasopharyngeal swab in August 2023. Weekly samples (n=24) were collected through February 2024 (171 days). All samples underwent whole-genome sequencing, with select mutations subjected to functional assessment. Findings: Sequencing identified the GE.1 lineage at all timepoints. An intra-host recombination event in ORF1ab (positions 8942-12458) was detected prior to 23 weeks post-detection, followed by a 14-fold increase in viral load (7.42e+06 to 1.00e+08 RNA copies/mL) and a marked shift in the viral population. E340D, a sotrovimab resistance mutation, was detected at low abundance (46%) within the first week post-infection, fluctuated over time, and was nearly fixed by week 15 (107 days) post-detection. We assessed five spike mutations - V36M, S98F, and V213G in the N-terminal domain, Y505P in the receptor-binding domain, and P681Q near the S1/S2 cleavage site - and additionally evaluated the impact of E340D. V36M conferred the highest infectivity across all cell lines, with the most significant effect in low-TMPRSS2 cells. While all mutations showed enhanced infectivity with the addition of E340D, the effect was most pronounced in mutations with lower baseline infectivity. The addition of E340D significantly decreased relative neutralizing titres for V36M, S98F, and V213G, enabling escape from neutralizing antibodies in XBB-responsive individuals, illustrating an enhanced phenotypic advantage. Patient neutralizing activity was absent pre-sotrovimab, and sotrovimab-induced neutralization was further compromised by selection of E340D. Interpretation: Sotrovimab pre-exposure prophylaxis in an immunocompromised patient did not prevent SARS-CoV-2 infection, and selected for resistant mutation E340D, with unexpected fitness consequences across non-receptor binding domain spike regions.

10

Genome-Wide Association Studies and Deep-Learning Functional Annotation of Opioid Use Disorder across Three Ancestries in the All of Us Research Program

Gu, S.; Petrovitch, D.; Hall, O. T.; Lambert, J. W.; Kember, R. L.; Nahid, N. A.; Ma, Q.; Sprague, J. E.; McDonough, C. W.; Johnson, J. A.

2026-07-17 addiction medicine 10.64898/2026.07.15.26358096 medRxiv

Top 16%

0.0%

Background: Opioid use disorder (OUD) is heritable, yet most genome-wide association studies (GWAS) have focused on European populations, leaving the genetic architecture of OUD in non-European populations underexplored. Methods: We conducted GWAS of OUD across three ancestries using electronic health records and genomic data from 52,357 All of Us Research Program participants (8,912 cases; 43,445 matched opioid-exposed controls; 48.5% female). Participants were stratified into European (EUR), African (AFR), and Admixed American (AMR) ancestry groups for logistic regression GWAS, with independent replication in the Million Veteran Program. We then applied the deep-learning model AlphaGenome to predict the tissue-specific transcriptomic and splicing consequences of top risk variants across 13 reward-pathway brain regions. Results: We identified and replicated a novel DDX6 risk locus, alongside established OPRM1 and FURIN signals. AlphaGenome predicted the DDX6 regulatory allele downregulates the stress-resistance gene FOXR1 in the nucleus accumbens, while the protective OPRM1 variant (rs1799971) upregulates OPRM1 expression across reward networks. Other signals of interest included IL6R and SHISA9 (EUR); GHR (AFR); and ASTN2 (AMR). Conclusions: This study identifies DDX6 as a novel OUD risk locus, replicates associations with OPRM1 and FURIN, and highlights biologically plausible ancestry-specific signals in AFR and AMR populations. We also replicated top variants in an independent population. Finally, integrating GWAS with deep-learning annotations provides specific, localized biological hypotheses to guide future experimental validation and targeted therapeutics.

11

Efficient stochastic epidemic simulation via the Sellke construction

van Boven, M.; Bootsma, M. C.

2026-07-17 epidemiology 10.64898/2026.07.16.26358219 medRxiv

Top 16%

0.0%

Stochastic epidemic models are a cornerstone of infectious disease epidemiology and are often used to study intervention scenarios. However, large run-to-run variability can make intervention effects difficult to estimate precisely. We revisit the epidemic Sellke construction, which assigns each individual an infection threshold for the cumulative infection hazard such that, conditional on the thresholds, the epidemic trajectory becomes deterministic. This enables coupling of simulations with and without an intervention, yielding low-variance effect estimates even when outcomes such as final size or peak incidence vary widely between runs. We develop an exact, event-driven implementation that maintains infection and recovery events in priority queues. Cumulative infection-hazard updates require O(log N) time per event, yielding overall complexity O(E log N) for E events in a population of size N. The implementation achieves computational performance comparable to the classical Gillespie algorithm while naturally accommodating non-Markovian infectious periods and complex infectiousness profiles. We illustrate the approach using distance-dependent spread of avian influenza between poultry farms in the Netherlands and a multilayer population with households, schools, and workplaces. In both examples, coupling enables efficient within-run comparisons of intervention scenarios across stochastic realisations.
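
The coupling idea can be sketched for the simplest case, a homogeneously mixing SIR epidemic. This is not the authors' implementation (which handles individual-specific hazards with priority queues for both event types); it is a minimal version in which every susceptible shares one cumulative infection pressure, so sorted thresholds plus a heap of recovery times suffice. Function names and parameter values are hypothetical.

```python
import heapq
import math
import random


def sellke_sir(n, beta, thresholds, infectious_periods, initial_infected=1):
    """Final size (including initial cases) of a homogeneous SIR epidemic.

    Individual i becomes infected when the cumulative pressure
    (beta / n) * integral of I(t) dt first reaches thresholds[i].
    Given the thresholds and infectious periods, the outcome is deterministic.
    """
    # Susceptibles in order of increasing threshold: this is the infection order.
    order = sorted(range(initial_infected, n), key=lambda i: thresholds[i])
    recoveries = [infectious_periods[i] for i in range(initial_infected)]
    heapq.heapify(recoveries)  # min-heap of pending recovery times

    t = 0.0
    pressure = 0.0
    next_idx = 0
    total_infected = initial_infected
    while recoveries:
        rate = beta * len(recoveries) / n  # growth rate of cumulative pressure
        candidate = order[next_idx] if next_idx < len(order) else None
        if candidate is not None and rate > 0:
            t_inf = t + max(0.0, thresholds[candidate] - pressure) / rate
        else:
            t_inf = math.inf
        t_rec = recoveries[0]
        if t_inf < t_rec:
            # Next event is an infection: pressure has just reached the threshold.
            pressure = thresholds[candidate]
            t = t_inf
            heapq.heappush(recoveries, t + infectious_periods[candidate])
            next_idx += 1
            total_infected += 1
        else:
            # Next event is a recovery: accumulate pressure up to that time.
            pressure += rate * (t_rec - t)
            t = t_rec
            heapq.heappop(recoveries)
    return total_infected


def coupled_final_sizes(n, beta_baseline, beta_intervention, seed=0):
    """Run both scenarios on the same thresholds and infectious periods."""
    rng = random.Random(seed)
    thresholds = [rng.expovariate(1.0) for _ in range(n)]  # Exp(1) thresholds
    periods = [rng.gammavariate(2.0, 0.5) for _ in range(n)]  # non-exponential periods
    return (
        sellke_sir(n, beta_baseline, thresholds, periods),
        sellke_sir(n, beta_intervention, thresholds, periods),
    )


if __name__ == "__main__":
    for seed in range(3):
        print(coupled_final_sizes(1000, 2.0, 1.2, seed=seed))
```

Because both runs reuse the same random inputs, the difference in final size within each pair reflects the change in transmission rate rather than independent stochastic noise, which is the variance-reduction effect the abstract describes.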

12

Neonatal admission as a marker of risk for poor educational attainment and special educational needs in children aged 5-11 years

John, A.; Pike, C.; Olga, L.; Sovio, U.; Wong, H. S.; Smith, G. C.; Aiken, C.

2026-07-17 pediatrics 10.64898/2026.07.15.26358132 medRxiv

Top 16%

0.0%

Background: Children born prematurely (before 37 weeks) or admitted to the neonatal unit (NNU) are at increased risk of adverse long-term physical health outcomes. It is also recognised that there is an association with later academic performance and special educational needs; however, it is not clear whether these broad risk factors could be used as stand-alone heuristics to identify children who may benefit from additional support in educational settings. We aimed to examine the associations between NNU admission and educational attainment in mid-childhood. Methods and Findings: Pregnancy data from a prospective birth cohort (Pregnancy Outcome Prediction Study, Cambridge, United Kingdom, 2008-2012) were linked to national educational outcomes (Department for Education, United Kingdom). Multivariable regression models adjusted for maternal, child, and socioeconomic factors were used to evaluate associations between (i) all NNU admissions, (ii) term NNU admissions >48 hours, (iii) preterm birth without ongoing physical health needs, and educational outcomes at ages 5-11 years. Children who required any NNU care were more likely not to meet expected educational standards across multiple ages and domains in early and mid-childhood: age 5 early years foundation (aOR 1.64, 95% CI 1.19-2.27, p=0.003), phonics at age 6 (aOR 2.43, 95% CI 1.72-3.57, p<0.001), and at age 7 (here assessments were divided into multiple domains): reading (aOR 1.67, 95% CI 1.18-2.38, p=0.004), writing (aOR 1.72, 95% CI 1.25-2.38, p<0.001), mathematics (aOR 1.56, 95% CI 1.09-2.22, p=0.020), and science (aOR 1.85, 95% CI 1.22-2.78, p=0.003). Similar patterns were observed both among term-born infants who stayed >48 hours in the NNU (phonics assessment at age 6 aOR 2.26, 95% CI 1.51-3.36, p<0.001) and among children born preterm without long-term physical health sequelae (phonics assessment at age 6 aOR 3.07, 95% CI 1.96-4.81, p<0.001). These associations were robust to adjustment for demographic, perinatal, and socio-economic factors. By age 11, differences in academic attainment were attenuated and no longer clearly distinguishable across all exposure groups. However, there was an increased likelihood of special educational needs (SEN) at age 11 associated with any NNU admission (aOR 1.78, 95% CI 1.15-2.73, p=0.009), term NNU admission for >48 hours (aOR 1.88, 95% CI 1.19-3.00, p=0.007), and preterm birth without long-term physical health sequelae (aOR 1.50, 95% CI 1.00-2.25, p=0.049). Predictive performance of any NNU admission for SEN at age 11 was moderate (AUC 0.70, 95% CI: 1.14-2.65, p=0.010), with balanced sensitivity and specificity and high negative predictive value. Conclusions: NNU admission, for both term and preterm infants, is associated with poorer educational outcomes and an increased likelihood of special educational needs in mid-childhood.

13

Comparing Human and Large Language Model Responses to Patients Online Questions: Towards Multi-dimensional Patient-centered Support

Hussein, M. A.; Doshi, R.; He, L.; Reynolds, T.

2026-07-17 health informatics 10.64898/2026.07.15.26355314 medRxiv

Top 16%

0.0%

Patients and caregivers seek informational and emotional support throughout medical care, especially when interpreting unfamiliar laboratory test results. Although resources such as patient portals and online health communities (OHCs) help address questions, gaps remain. Large language models (LLMs) are a potential complementary source of support for patients and caregivers in understanding and using their test results. The objective of our study is to empirically compare LLM responses to patients' online questions containing their laboratory test results with responses written by peers in an OHC. We compared 519 peer replies to 122 laboratory test-related posts from an OHC with 488 responses generated by four LLMs, using mixed computational and qualitative methods. LLM responses frequently provided clear explanations of medical terminology and structured interpretations of numeric results but were longer and less readable. Peers offered more personalized, context-specific emotional support. Overall, LLMs have the potential to complement peer responses in OHCs, but require greater emotional depth, reasoning transparency, and alignment with community norms.

14

General Practice Perspectives on Post-Infection Conditions: Scoping Review and UK Survey

Aung, K. W.; Scuffell, J.; Podlasek, A.; Engamba, S.; Jones, F.; Edwards, A.; Chew-Graham, C. A.; Sanyaolu, L.; Busse-Morris, M.

2026-07-17 primary care research 10.64898/2026.07.15.26358157 medRxiv

Top 16%

0.0%

Background Post-infection conditions (PICs), such as Long Covid, are associated with heterogeneous, fluctuating symptoms that profoundly affect daily functioning. Despite moderate-certainty evidence from the NIHR-funded LISTEN trial (COV-LT2-0009) that personalised self-management support improves outcomes and may reduce societal and economic impacts of Long Covid, many people living with PICs still receive condition-specific services, generic advice, or stand-alone digital tools that do not address their complex needs. Aim To map care approaches in general practice and synthesise UK evidence for PIC management. Design and setting Scoping review and online survey. Method A two-phase study was conducted: (1) a scoping review of UK evidence on PIC management in general practice; and (2) a supplementary online survey of practitioners working in UK general practice to provide contextual insights. Results The scoping review identified 32 studies focused on Long Covid. One study included a comparator group (ME/CFS). Study populations were predominantly female and of white ethnicity. Evidence for non-Covid PICs in UK general practice was largely absent. The supplementary survey (n=46) provided preliminary practice-level insights. Healthcare practitioners reported varied PIC presentations, diagnostic uncertainty, limited referral pathways, inequitable access, and low confidence in managing PICs. Conclusion Evidence informing PIC management in UK general practice remains predominantly Long Covid-focused and may not reflect the range of PICs encountered in practice. While survey findings are preliminary and require confirmation in larger samples, they highlight uncertainty around PIC management. Further research is needed to evaluate whether existing Long Covid pathways should be expanded or complemented by broader PIC models. Keywords general practice; Long Covid; self-management; post-viral syndromes

15

Temporal relationships between distress and pain in people living with HIV

Arendse, G.; Kamerman, P.; Wadley, A.; Edwards, R. R.; Joska, J.; Parker, R.; Madden, V. J.

2026-07-17 primary care research 10.64898/2026.07.15.26358133 medRxiv

Top 16%

0.0%

Objective: There is a bidirectional relationship between emotional distress and pain. However, this relationship is understudied in people with HIV in low-resource settings. This study sought to describe the temporal relationship between emotional distress and pain in people with HIV. Design: Longitudinal observational study. Methods: Participants with virally suppressed HIV, reporting either no pain or persistent pain at baseline, provided weekly remote ratings of distress, worst pain, and average pain using 0-10 visual analogue scales. Within-individual fluctuations in distress and pain were visualised over time. Group-level correlations were determined using Spearman's correlation tests. Cumulative link mixed models assessed whether distress and pain each predicted the other in the following week. Results: 72 participants provided responses over 49 weeks. The participants had a median (IQR) age of 43 (37-51) years, 63% (n=45) were unemployed, and most were female (n=51; 71%). Distress and pain fluctuated concurrently within individuals: distress was positively correlated with worst pain (ρ=0.66, 95% CI=0.60-0.72, p<0.001) and average pain (ρ=0.70, 95% CI=0.64-0.75, p<0.001) intensity within the same week. Worst pain (OR=1.42, 95% CI=1.17-1.71, p<0.001) and average pain (OR=1.43, 95% CI=1.20-1.71, p<0.001) intensity both predicted distress in the next week. Distress predicted worst pain intensity (OR=1.25, 95% CI=1.07-1.46, p=0.023) but not average pain intensity (OR=1.19, 95% CI=1.01-1.40, p=0.152) in the next week. Conclusions: The temporal relationship between distress and worst pain intensity was bidirectional, whereas distress did not temporally predict average pain intensity. Both pain and emotional distress should receive attention from HIV research and clinical care in low-resource settings.

16

Trends and variations in Lithium usage across care settings in England between 2015-2024

Schiffer, H.; Fisher, L.; Curtis, H. J.; Wood, C.; Brown, A. D.; Bacon, S. C.; Croker, R.; Goldacre, B.; MacKenna, B.; Speed, V.; Macdonald, O.

2026-07-17 psychiatry and clinical psychology 10.64898/2026.07.15.26357641 medRxiv

Top 16%

0.0%

Lithium has been the gold standard for the treatment and prevention of relapse in bipolar disorder for over 60 years. Guidance from the National Institute for Health and Care Excellence explicitly says to 'offer lithium as a first-line, long-term pharmacological treatment for bipolar disorder'. Yet, in the last two decades, its use has been in decline, with clinicians favouring anticonvulsants or antipsychotics when treating this condition. In this study, we used three openly available datasets containing prescribing data from primary and secondary care to explore trends in the use of lithium in England, showing both regional and temporal variation between 2015 and 2024. We have shown that lithium use declined in primary care by 20.9% in the last ten years (2015-2024) and 10.9% overall in the last five years (2019 to 2025). We have also shown that there is some regional variation in the source of lithium for patients, although the vast majority is prescribed in primary care. Further research into clinical behaviour is needed to understand what is driving the decrease in lithium usage, and what barriers and enablers may influence its use across the country.

17

Human GPR174 deficiency drives polyclonal lymphoproliferative disease via defects in T cell function

Huang, Y.-H.; Arana, K.; Rachimi, S.; Tam, H.; Spegarova, J. S.; Engelhardt, K. R.; Griffin, H.; Mee, M.; Miano, M.; Raggi, F.; Grossi, A.; Rusmini, M.; Ceccherini, I.; Dell'Orso, G.; Ferro, J.; Giarratana, M. C.; Pillai, V.; Banka, S.; Garcez, T.; Briggs, T. A.; Mellouli, F.; von Hardenberg, S.; Beier, R.; Auber, B.; Baumann, U.; Tawamie, H.; Behrens, E.; Oldridge, D. A.; Cabrera, E. C.; Xu, Y.; Ouyang, S.; Hambleton, S.; Romberg, N.; Cyster, J. G.

2026-07-17 rheumatology 10.64898/2026.07.14.26357774 medRxiv

Top 16%

0.0%

The X-linked G-protein coupled receptor GPR174 is highly expressed in T and B lymphocytes and has immunoregulatory roles in mice, but its function in humans is unknown. We describe a cohort of six individuals who have function-disrupting variants in GPR174 and a clinical phenotype of lymphadenopathy and autoimmunity. Histological analysis of two patient lymph nodes revealed necrotizing lymphadenitis and lymphoproliferation resembling Kikuchi-Fujimoto disease. In-depth analysis of three patients and related carriers revealed overaccumulation of CD8 terminally differentiated effector memory cells re-expressing CD45RA (TEMRA). Patient cells and GPR174-deficient CD8 T cells generated from controls showed less repression of proliferation by the GPR174 ligand lysophosphatidylserine (lysoPS) and an effector-biased gene expression program. GPR174-deficient CD4 T cells were resistant to lysoPS-mediated suppression of IL2 production. In mice, chronic viral infection led to over-accumulation of GPR174-deficient effector CD8 T cells. We describe an inborn error of immunity associated with dysregulated lymphocyte responses that we propose predisposes to exaggerated lymphoproliferation and autoimmunity following viral infection.

18

Large Language Model - Enhanced Decision Tree Framework for Identifying Multiple Sclerosis Diagnoses from Clinical Documentation

Venkatesh, S.; DelSignore, M.; Wu, X.; Morris, M.; Kerr, W. T.; Visweswaran, S.; Wang, Y.; Xia, Z.

2026-07-17 neurology 10.64898/2026.07.14.26357416 medRxiv

Top 16%

0.0%

Background. Early diagnosis and intervention are crucial in multiple sclerosis (MS), yet diagnostic delays are common. Large language models (LLMs) such as generative pre-trained transformers (GPTs) may help streamline diagnostic workflows by extracting MS diagnostic signals from clinical notes. Objective. To derive MS diagnosis status from the first neurology note using a computable algorithm based on the 2017 McDonald criteria and applying GPT-4 for node-level reasoning within a structured decision framework. Methods. We analyzed first neurology notes from 125 randomly selected patients (including those with MS, related disorders, and controls) enrolled in a clinic cohort between 2017 and 2023. We included the clinical history and diagnostic testing sections but redacted the assessment and plan. We converted the 2017 McDonald criteria into a decision tree and provided expert-curated clinical knowledge to guide GPT-4 reasoning at each decision node. GPT-4 generated binary decisions at each node to traverse the tree and classified MS diagnoses at terminal nodes. We evaluated performance against neurologist-assessed diagnoses and characterized hallucinations (non-factual, incongruent, irrelevant, over-reliant, and logical reasoning errors). Results. In this study cohort (mean age 40±13 years; 81% women) representative of the clinic population, GPT-4 performed well in predicting MS diagnosis (84% accuracy, 79% precision, 74% recall, 91% specificity) using first neurology notes. Hallucinations occurred in 32 cases (26%), most commonly incoherence (75%) and overreliance (47%). Conclusion. A structured, LLM-guided decision framework can flag MS diagnoses from early clinical documentation. Large-scale studies are needed to mitigate hallucinations, validate this approach, and test implementation in clinical settings.
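
The tree traversal described in this abstract (a binary decision at each node, a classification at each terminal node) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Node`/`Leaf` classes, the two toy questions, and the `keyword_decider` stub that stands in for the LLM call. The toy tree is a placeholder and does not encode the 2017 McDonald criteria or the authors' prompts.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class Leaf:
    """Terminal node carrying a final classification label."""
    label: str


@dataclass
class Node:
    """Internal node: a yes/no question with one child per answer."""
    question: str
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


# A decider answers one question about one note with True (yes) or False (no).
Decider = Callable[[str, str], bool]


def traverse(tree: Union[Node, Leaf], note: str,
             decide: Decider) -> Tuple[str, List[Tuple[str, bool]]]:
    """Walk from the root to a leaf, recording each (question, answer) pair."""
    path: List[Tuple[str, bool]] = []
    current = tree
    while isinstance(current, Node):
        answer = decide(current.question, note)
        path.append((current.question, answer))
        current = current.yes if answer else current.no
    return current.label, path


# Hypothetical two-question tree; NOT the real diagnostic criteria.
Q_ATTACK = "Does the note mention an attack?"
Q_LESIONS = "Does the note mention lesions?"
TOY_TREE = Node(
    question=Q_ATTACK,
    yes=Node(question=Q_LESIONS,
             yes=Leaf("toy: positive"),
             no=Leaf("toy: negative")),
    no=Leaf("toy: negative"),
)

# Hypothetical stand-in for the model: a keyword lookup per question.
KEYWORDS = {Q_ATTACK: "attack", Q_LESIONS: "lesions"}


def keyword_decider(question: str, note: str) -> bool:
    """Return True when the question's keyword appears in the note."""
    return KEYWORDS[question] in note.lower()


if __name__ == "__main__":
    label, path = traverse(
        TOY_TREE,
        "Patient reports an attack; MRI shows multiple lesions.",
        keyword_decider,
    )
    print(label)
    for question, answer in path:
        print(f"  {question} -> {'yes' if answer else 'no'}")
```

Keeping the decider as a plain callable means the recorded path can be audited node by node, which is the level at which the abstract reports reasoning errors.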

19

Scaling ECG Foundation Models and Identifying a Threshold for Effective Representation Learning

Sriram, R.; Nenadic, I.; Shahrabani, E.; Goonewardena, S.; Yao, S.; Farrell, B.; Loring, Z.; Murthy, V. L.

2026-07-17 cardiovascular medicine 10.64898/2026.07.15.26358182 medRxiv

Top 16%

0.0%

We conducted a scaling evaluation of unlabeled pretraining for electrocardiogram foundation model performance. One-dimensional vision transformer masked autoencoders were pretrained across increasing ECG volumes and fine-tuned for rhythm, morphology, diagnostic, and structural heart disease tasks. Models pretrained below 400,000 ECGs failed to consistently exceed controls without self-supervised pre-training, whereas 600,000 to 800,000 ECGs improved AUROC across tasks, suggesting a minimum threshold for effective ECG representation learning.

20

FootNet: A Multi-View Smartphone Dataset and Four-Model Benchmark for Clinical Foot Segmentation

Vijay, A.; Prabhune, A.; Srihari, V. R.; Rayampalli, A.

2026-07-17 health informatics 10.64898/2026.07.15.26358117 medRxiv

Top 16%

0.0%

We present FootNet, a 453-image multi-view smartphone foot dataset for binary foot segmentation, with expert-annotated masks across six anatomical views (dorsal, medial, and plantar, both left and right). We benchmark four segmentation models under a controlled protocol: U-Net with a MobileNetV2 encoder achieves the best performance (IoU 0.9268, Dice 0.9608, 95% CI [0.9209, 0.9320]); DeepLabV3 with MobileNetV3-Large scores IoU 0.8984 (Dice 0.9449); UNet++ with MobileNetV2 scores IoU 0.8913 (Dice 0.9391); and SAM ViT-B with oracle bounding-box prompt scores IoU 0.9219 on the matched 191-image subset. Bonferroni-corrected Wilcoxon signed-rank tests (k = 6 comparisons) show U-Net significantly outperforms DeepLab (p < 0.001, r = 0.638) and SAM ViT-B with oracle bounding-box prompt (p = 0.005, r = 0.202); UNet++ does not significantly differ from DeepLab (p = 0.062). Connected-component postprocessing yields negligible benefit (mean ΔIoU = +0.0003, 12 of 453 images improved). The extended dataset is available upon request
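
For readers unfamiliar with the two overlap metrics reported above, the following is a minimal sketch of how IoU and Dice are commonly computed for a single pair of binary masks. The function name, the nested-list mask representation, and the convention of scoring two empty masks as 1.0 are assumptions for illustration, not details taken from the FootNet protocol.

```python
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 pixel labels (assumed representation)


def iou_and_dice(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (IoU, Dice) for two binary masks of identical shape."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    pred_area = 0
    truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            pred_area += 1 if p else 0
            truth_area += 1 if t else 0
            intersection += 1 if (p and t) else 0

    union = pred_area + truth_area - intersection
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + truth_area)
    return iou, dice


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    iou, dice = iou_and_dice(predicted, reference)
    print(f"IoU={iou:.4f} Dice={dice:.4f}")
```

For a single pair of masks the two metrics are linked by Dice = 2·IoU / (1 + IoU); averages taken over many images need not satisfy that identity exactly.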